Background information

This data package contains the data that powers the chart “Share of adults who smoke” on the Our World in Data website. The original data come from the World Bank’s World Development Indicators: https://datacatalog.worldbank.org/search/dataset/0037712/World-Development-Indicators

Research question

How does the prevalence of smoking in adults across the world vary in the dataset, and what trends or patterns emerge from this analysis?

Project Organization

The PSY6422_smoke repository is organized into key sections to help you navigate its contents. The /codebook folder provides detailed documentation on the dataset, including variable descriptions and structure, offering essential context for the analysis. The /data folder contains the raw datasets used in this project, forming the basis of all analyses. The /figures folder showcases visualizations and plots created during the analysis, highlighting the project’s key findings and insights. Lastly, the /scripts folder includes all the code used for data processing, analysis, and visualization. Together, these sections guide you through the project workflow, from raw data to final outputs.

Data set

data set origins

The raw dataset for this visualization project comes from: Multiple sources compiled by World Bank (2024) – processed by Our World in Data. “Prevalence of current tobacco use (% of adults)” [dataset]. World Health Organization (via World Bank), “World Development Indicators” [original data].

The indicator measures the percentage of the population aged 15 years and over who currently use any tobacco product (smoked and/or smokeless tobacco) on a daily or non-daily basis. Tobacco products include cigarettes, pipes, cigars, cigarillos, waterpipes (hookah, shisha), bidis, kretek, heated tobacco products, and all forms of smokeless (oral and nasal) tobacco. Tobacco products exclude e-cigarettes (which do not contain tobacco), “e-cigars”, “e-hookahs”, JUUL and “e-pipes”. The rates are age-standardized to the WHO Standard Population.

limitations

The following limitation is important when interpreting the project’s results: estimates for countries with irregular surveys or many data gaps have large uncertainty ranges and should be interpreted with caution.

Data preparation

install packages

# List of packages to install and load
packages <- c("tidyverse", "ggplot2", "tidyr", "dplyr", "plotly", "rnaturalearth", "rnaturalearthdata", "sf")

# Function to install packages and load them
install_and_load <- function(packages) {
  for (package in packages) {
    # require() attaches the package and returns FALSE if it is not installed
    if (!require(package, character.only = TRUE)) {
      install.packages(package, dependencies = TRUE)
      library(package, character.only = TRUE)
    }
  }
}

# Run the function
install_and_load(packages)
## Loading required package: tidyverse
## Warning: package 'ggplot2' was built under R version 4.4.2
## Warning: package 'tidyr' was built under R version 4.4.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
## Loading required package: plotly
## Warning: package 'plotly' was built under R version 4.4.2
## 
## Attaching package: 'plotly'
## 
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following object is masked from 'package:graphics':
## 
##     layout
## 
## Loading required package: rnaturalearth
## Warning: package 'rnaturalearth' was built under R version 4.4.2
## Loading required package: rnaturalearthdata
## Warning: package 'rnaturalearthdata' was built under R version 4.4.2
## 
## Attaching package: 'rnaturalearthdata'
## 
## The following object is masked from 'package:rnaturalearth':
## 
##     countries110
## 
## Loading required package: sf
## Warning: package 'sf' was built under R version 4.4.2
## Linking to GEOS 3.12.2, GDAL 3.9.3, PROJ 9.4.1; sf_use_s2() is TRUE

load data

``` r
# Load raw data
rawdata <- read.csv("data/smoking.csv")
```

creating variables

To create my visualization, I had to create new variables. I obtained the world object from the ne_countries() function in the rnaturalearth package, which returns country boundaries as an sf (simple features) object. I then merged the world data and my smoking data into ‘map_data’ via the ISO code.
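A merge by ISO code can be illustrated on a small scale. The sketch below uses two made-up tables (world_toy and smoking_toy are hypothetical, not the project data) and base R merge() with all.x = TRUE, which behaves like a left join: every map row is kept, a country with several years of data appears once per year, and a country without data gets NA.

```r
# Toy stand-ins for the map and smoking tables (all values are made up)
world_toy <- data.frame(
  iso_a3 = c("AAA", "BBB", "CCC"),
  name   = c("Country A", "Country B", "Country C")
)
smoking_toy <- data.frame(
  Code       = c("AAA", "AAA", "BBB"),
  Year       = c(2000, 2005, 2000),
  Prevalence = c(30.1, 27.4, 12.8)
)

# Left join: keep every map row, attach smoking rows where the codes match
map_toy <- merge(world_toy, smoking_toy,
                 by.x = "iso_a3", by.y = "Code", all.x = TRUE)

print(map_toy)
print(nrow(map_toy))
```

Because the smoking table is in long format (one row per country and year), the merged table has more rows than the map table, which is worth keeping in mind when plotting a single year.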

cleaning data

After my initial sanity check, I started to clean the data. The dataset contained aggregate entities, such as world regions and income groups, alongside individual countries. To visualize the data by country, I had to remove these. Furthermore, I wanted to visualize the data in 5-year steps, so I removed 2018 and 2019. I also renamed the prevalence variable to Prevalence for ease of use and fixed the ISO codes that were missing from the world map data (France and Norway).

# Clean data: Remove specific entities
countries_data <- rawdata[!rawdata$Entity %in% c("East Asia and Pacific (WB)", "Sub-Saharan Africa (WB)", 
                                                 "Upper-middle-income countries", "Europe and Central Asia (WB)", 
                                                 "World", "European Union (27)", "Low-income countries", 
                                                 "Lower-middle-income countries", "Middle East and North Africa (WB)", 
                                                 "Middle-income countries", "North America (WB)", "South Asia (WB)", 
                                                 "Latin America and Caribbean (WB)", "High-income countries"), ]

# Further clean data: Exclude years 2018 and 2019
countries_data <- countries_data %>%
  filter(!(Year %in% c(2018, 2019)))

# Rename the column
countries_data <- countries_data %>%
  rename(Prevalence = Prevalence.of.current.tobacco.use....of.adults.) 
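The year filter above lists the unwanted years by hand. A possible alternative, sketched here on made-up data, keeps only years that fall on a 5-year step using the modulo operator. This assumes that the years to keep are exactly the multiples of five; the toy data frame and its values are hypothetical.

```r
# Toy data with the same column names as the cleaned data (values made up)
toy <- data.frame(
  Entity     = c("Country A", "Country A", "Country A", "Country A"),
  Year       = c(2010, 2015, 2018, 2020),
  Prevalence = c(25.0, 23.5, 22.9, 22.0)
)

# Keep only rows whose year is divisible by 5
five_year <- toy[toy$Year %% 5 == 0, ]

print(five_year$Year)
```

This version would not need editing if further off-step years were added to the dataset, but it is only equivalent to the explicit filter when the assumption about the years holds.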

#sanity check

# Load world map data
world <- ne_countries(scale = "medium", returnclass = "sf")  # 'sf' format for spatial data

# Check the structure of the geospatial data
head(world)
## Simple feature collection with 6 features and 168 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -73.36621 ymin: -22.40205 xmax: 109.4449 ymax: 41.9062
## Geodetic CRS:  WGS 84
##        featurecla scalerank labelrank sovereignt sov_a3 adm0_dif level
## 1 Admin-0 country         1         3   Zimbabwe    ZWE        0     2
## 2 Admin-0 country         1         3     Zambia    ZMB        0     2
## 3 Admin-0 country         1         3      Yemen    YEM        0     2
## 4 Admin-0 country         3         2    Vietnam    VNM        0     2
## 5 Admin-0 country         5         3  Venezuela    VEN        0     2
## 6 Admin-0 country         6         6    Vatican    VAT        0     2
##                type tlc     admin adm0_a3 geou_dif   geounit gu_a3 su_dif
## 1 Sovereign country   1  Zimbabwe     ZWE        0  Zimbabwe   ZWE      0
## 2 Sovereign country   1    Zambia     ZMB        0    Zambia   ZMB      0
## 3 Sovereign country   1     Yemen     YEM        0     Yemen   YEM      0
## 4 Sovereign country   1   Vietnam     VNM        0   Vietnam   VNM      0
## 5 Sovereign country   1 Venezuela     VEN        0 Venezuela   VEN      0
## 6 Sovereign country   1   Vatican     VAT        0   Vatican   VAT      0
##     subunit su_a3 brk_diff      name name_long brk_a3  brk_name brk_group
## 1  Zimbabwe   ZWE        0  Zimbabwe  Zimbabwe    ZWE  Zimbabwe      <NA>
## 2    Zambia   ZMB        0    Zambia    Zambia    ZMB    Zambia      <NA>
## 3     Yemen   YEM        0     Yemen     Yemen    YEM     Yemen      <NA>
## 4   Vietnam   VNM        0   Vietnam   Vietnam    VNM   Vietnam      <NA>
## 5 Venezuela   VEN        0 Venezuela Venezuela    VEN Venezuela      <NA>
## 6   Vatican   VAT        0   Vatican   Vatican    VAT   Vatican      <NA>
##   abbrev postal                        formal_en
## 1  Zimb.     ZW             Republic of Zimbabwe
## 2 Zambia     ZM               Republic of Zambia
## 3   Yem.     YE                Republic of Yemen
## 4  Viet.     VN    Socialist Republic of Vietnam
## 5   Ven.     VE Bolivarian Republic of Venezuela
## 6   Vat.      V        State of the Vatican City
##                            formal_fr              name_ciawf note_adm0 note_brk
## 1                               <NA>                Zimbabwe      <NA>     <NA>
## 2                               <NA>                  Zambia      <NA>     <NA>
## 3                               <NA>                   Yemen      <NA>     <NA>
## 4                               <NA>                 Vietnam      <NA>     <NA>
## 5 República Bolivariana de Venezuela               Venezuela      <NA>     <NA>
## 6                               <NA> Holy See (Vatican City)      <NA>     <NA>
##            name_sort name_alt mapcolor7 mapcolor8 mapcolor9 mapcolor13  pop_est
## 1           Zimbabwe     <NA>         1         5         3          9 14645468
## 2             Zambia     <NA>         5         8         5         13 17861030
## 3        Yemen, Rep.     <NA>         5         3         3         11 29161922
## 4            Vietnam     <NA>         5         6         5          4 96462106
## 5      Venezuela, RB     <NA>         1         3         1          4 28515829
## 6 Vatican (Holy See) Holy See         1         3         4          2      825
##   pop_rank pop_year gdp_md gdp_year                    economy
## 1       14     2019  21440     2019    5. Emerging region: G20
## 2       14     2019  23309     2019  7. Least developed region
## 3       15     2019  22581     2019  7. Least developed region
## 4       16     2019 261921     2019    5. Emerging region: G20
## 5       15     2019 482359     2014    5. Emerging region: G20
## 6        2     2019    -99     2019 2. Developed region: nonG7
##                income_grp fips_10 iso_a2 iso_a2_eh iso_a3 iso_a3_eh iso_n3
## 1           5. Low income      ZI     ZW        ZW    ZWE       ZWE    716
## 2  4. Lower middle income      ZA     ZM        ZM    ZMB       ZMB    894
## 3  4. Lower middle income      YM     YE        YE    YEM       YEM    887
## 4  4. Lower middle income      VM     VN        VN    VNM       VNM    704
## 5  3. Upper middle income      VE     VE        VE    VEN       VEN    862
## 6 2. High income: nonOECD      VT     VA        VA    VAT       VAT    336
##   iso_n3_eh un_a3 wb_a2 wb_a3   woe_id woe_id_eh                   woe_note
## 1       716   716    ZW   ZWE 23425004  23425004 Exact WOE match as country
## 2       894   894    ZM   ZMB 23425003  23425003 Exact WOE match as country
## 3       887   887    RY   YEM 23425002  23425002 Exact WOE match as country
## 4       704   704    VN   VNM 23424984  23424984 Exact WOE match as country
## 5       862   862    VE   VEN 23424982  23424982 Exact WOE match as country
## 6       336   336   -99   -99 23424986  23424986 Exact WOE match as country
##   adm0_iso adm0_diff adm0_tlc adm0_a3_us adm0_a3_fr adm0_a3_ru adm0_a3_es
## 1      ZWE      <NA>      ZWE        ZWE        ZWE        ZWE        ZWE
## 2      ZMB      <NA>      ZMB        ZMB        ZMB        ZMB        ZMB
## 3      YEM      <NA>      YEM        YEM        YEM        YEM        YEM
## 4      VNM      <NA>      VNM        VNM        VNM        VNM        VNM
## 5      VEN      <NA>      VEN        VEN        VEN        VEN        VEN
## 6      VAT      <NA>      VAT        VAT        VAT        VAT        VAT
##   adm0_a3_cn adm0_a3_tw adm0_a3_in adm0_a3_np adm0_a3_pk adm0_a3_de adm0_a3_gb
## 1        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE
## 2        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB
## 3        YEM        YEM        YEM        YEM        YEM        YEM        YEM
## 4        VNM        VNM        VNM        VNM        VNM        VNM        VNM
## 5        VEN        VEN        VEN        VEN        VEN        VEN        VEN
## 6        VAT        VAT        VAT        VAT        VAT        VAT        VAT
##   adm0_a3_br adm0_a3_il adm0_a3_ps adm0_a3_sa adm0_a3_eg adm0_a3_ma adm0_a3_pt
## 1        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE
## 2        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB
## 3        YEM        YEM        YEM        YEM        YEM        YEM        YEM
## 4        VNM        VNM        VNM        VNM        VNM        VNM        VNM
## 5        VEN        VEN        VEN        VEN        VEN        VEN        VEN
## 6        VAT        VAT        VAT        VAT        VAT        VAT        VAT
##   adm0_a3_ar adm0_a3_jp adm0_a3_ko adm0_a3_vn adm0_a3_tr adm0_a3_id adm0_a3_pl
## 1        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE
## 2        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB
## 3        YEM        YEM        YEM        YEM        YEM        YEM        YEM
## 4        VNM        VNM        VNM        VNM        VNM        VNM        VNM
## 5        VEN        VEN        VEN        VEN        VEN        VEN        VEN
## 6        VAT        VAT        VAT        VAT        VAT        VAT        VAT
##   adm0_a3_gr adm0_a3_it adm0_a3_nl adm0_a3_se adm0_a3_bd adm0_a3_ua adm0_a3_un
## 1        ZWE        ZWE        ZWE        ZWE        ZWE        ZWE        -99
## 2        ZMB        ZMB        ZMB        ZMB        ZMB        ZMB        -99
## 3        YEM        YEM        YEM        YEM        YEM        YEM        -99
## 4        VNM        VNM        VNM        VNM        VNM        VNM        -99
## 5        VEN        VEN        VEN        VEN        VEN        VEN        -99
## 6        VAT        VAT        VAT        VAT        VAT        VAT        -99
##   adm0_a3_wb     continent region_un          subregion
## 1        -99        Africa    Africa     Eastern Africa
## 2        -99        Africa    Africa     Eastern Africa
## 3        -99          Asia      Asia       Western Asia
## 4        -99          Asia      Asia South-Eastern Asia
## 5        -99 South America  Americas      South America
## 6        -99        Europe    Europe    Southern Europe
##                    region_wb name_len long_len abbrev_len tiny homepart
## 1         Sub-Saharan Africa        8        8          5  -99        1
## 2         Sub-Saharan Africa        6        6          6  -99        1
## 3 Middle East & North Africa        5        5          4  -99        1
## 4        East Asia & Pacific        7        7          5    2        1
## 5  Latin America & Caribbean        9        9          4  -99        1
## 6      Europe & Central Asia        7        7          4    4        1
##   min_zoom min_label max_label   label_x    label_y      ne_id wikidataid
## 1        0       2.5       8.0  29.92544 -18.911640 1159321441       Q954
## 2        0       3.0       8.0  26.39530 -14.660804 1159321439       Q953
## 3        0       3.0       8.0  45.87438  15.328226 1159321425       Q805
## 4        0       2.0       7.0 105.38729  21.715416 1159321417       Q881
## 5        0       2.5       7.5 -64.59938   7.182476 1159321411       Q717
## 6        0       5.0      10.0  12.45342  41.903323 1159321407       Q237
##     name_ar       name_bn      name_de      name_en             name_es
## 1  زيمبابوي      জিম্বাবুয়ে     Simbabwe     Zimbabwe            Zimbabue
## 2    زامبيا       জাম্বিয়া       Sambia       Zambia              Zambia
## 3     اليمن        ইয়েমেন        Jemen        Yemen               Yemen
## 4    فيتنام      ভিয়েতনাম      Vietnam      Vietnam             Vietnam
## 5   فنزويلا     ভেনেজুয়েলা    Venezuela    Venezuela           Venezuela
## 6 الفاتيكان ভ্যাটিকান সিটি Vatikanstadt Vatican City Ciudad del Vaticano
##    name_fa         name_fr    name_el       name_he   name_hi   name_hu
## 1 زیمبابوه        Zimbabwe Ζιμπάμπουε      זימבבואה   ज़िम्बाब्वे  Zimbabwe
## 2   زامبیا          Zambie     Ζάμπια         זמביה   ज़ाम्बिया    Zambia
## 3      یمن           Yémen     Υεμένη          תימן       यमन     Jemen
## 4   ویتنام        Viêt Nam    Βιετνάμ       וייטנאם   वियतनाम   Vietnám
## 5  ونزوئلا       Venezuela Βενεζουέλα       ונצואלה    वेनेज़ुएला Venezuela
## 6  واتیکان Cité du Vatican   Βατικανό קריית הוותיקן वैटिकन नगर   Vatikán
##     name_id            name_it    name_ja     name_ko      name_nl   name_pl
## 1  Zimbabwe           Zimbabwe ジンバブエ    짐바브웨     Zimbabwe  Zimbabwe
## 2    Zambia             Zambia   ザンビア      잠비아       Zambia    Zambia
## 3     Yaman              Yemen   イエメン        예멘        Jemen     Jemen
## 4   Vietnam            Vietnam   ベトナム      베트남      Vietnam   Wietnam
## 5 Venezuela          Venezuela ベネズエラ  베네수엘라    Venezuela Wenezuela
## 6   Vatikan Città del Vaticano   バチカン 바티칸 시국 Vaticaanstad   Watykan
##     name_pt   name_ru       name_sv   name_tr   name_uk    name_ur
## 1  Zimbábue  Зимбабве      Zimbabwe  Zimbabve  Зімбабве    زمبابوے
## 2    Zâmbia    Замбия        Zambia   Zambiya    Замбія     زیمبیا
## 3     Iémen     Йемен         Jemen     Yemen      Ємен        یمن
## 4  Vietname   Вьетнам       Vietnam   Vietnam   В'єтнам     ویتنام
## 5 Venezuela Венесуэла     Venezuela Venezuela Венесуела  وینیزویلا
## 6  Vaticano   Ватикан Vatikanstaten   Vatikan   Ватикан ویٹیکن سٹی
##         name_vi  name_zh name_zht      fclass_iso tlc_diff      fclass_tlc
## 1      Zimbabwe 津巴布韦   辛巴威 Admin-0 country     <NA> Admin-0 country
## 2        Zambia   赞比亚   尚比亞 Admin-0 country     <NA> Admin-0 country
## 3         Yemen     也门     葉門 Admin-0 country     <NA> Admin-0 country
## 4      Việt Nam     越南     越南 Admin-0 country     <NA> Admin-0 country
## 5     Venezuela 委内瑞拉 委內瑞拉 Admin-0 country     <NA> Admin-0 country
## 6 Thành Vatican   梵蒂冈   梵蒂岡 Admin-0 country     <NA> Admin-0 country
##   fclass_us fclass_fr fclass_ru fclass_es fclass_cn fclass_tw fclass_in
## 1      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 2      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 3      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 4      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 5      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 6      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
##   fclass_np fclass_pk fclass_de fclass_gb fclass_br fclass_il fclass_ps
## 1      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 2      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 3      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 4      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 5      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 6      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
##   fclass_sa fclass_eg fclass_ma fclass_pt fclass_ar fclass_jp fclass_ko
## 1      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 2      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 3      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 4      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 5      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 6      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
##   fclass_vn fclass_tr fclass_id fclass_pl fclass_gr fclass_it fclass_nl
## 1      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 2      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 3      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 4      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 5      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
## 6      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>      <NA>
##   fclass_se fclass_bd fclass_ua                       geometry
## 1      <NA>      <NA>      <NA> MULTIPOLYGON (((31.28789 -2...
## 2      <NA>      <NA>      <NA> MULTIPOLYGON (((30.39609 -1...
## 3      <NA>      <NA>      <NA> MULTIPOLYGON (((53.08564 16...
## 4      <NA>      <NA>      <NA> MULTIPOLYGON (((104.064 10....
## 5      <NA>      <NA>      <NA> MULTIPOLYGON (((-60.82119 9...
## 6      <NA>      <NA>      <NA> MULTIPOLYGON (((12.43916 41...
# Check the column names in both datasets
colnames(world)
##   [1] "featurecla" "scalerank"  "labelrank"  "sovereignt" "sov_a3"    
##   [6] "adm0_dif"   "level"      "type"       "tlc"        "admin"     
##  [11] "adm0_a3"    "geou_dif"   "geounit"    "gu_a3"      "su_dif"    
##  [16] "subunit"    "su_a3"      "brk_diff"   "name"       "name_long" 
##  [21] "brk_a3"     "brk_name"   "brk_group"  "abbrev"     "postal"    
##  [26] "formal_en"  "formal_fr"  "name_ciawf" "note_adm0"  "note_brk"  
##  [31] "name_sort"  "name_alt"   "mapcolor7"  "mapcolor8"  "mapcolor9" 
##  [36] "mapcolor13" "pop_est"    "pop_rank"   "pop_year"   "gdp_md"    
##  [41] "gdp_year"   "economy"    "income_grp" "fips_10"    "iso_a2"    
##  [46] "iso_a2_eh"  "iso_a3"     "iso_a3_eh"  "iso_n3"     "iso_n3_eh" 
##  [51] "un_a3"      "wb_a2"      "wb_a3"      "woe_id"     "woe_id_eh" 
##  [56] "woe_note"   "adm0_iso"   "adm0_diff"  "adm0_tlc"   "adm0_a3_us"
##  [61] "adm0_a3_fr" "adm0_a3_ru" "adm0_a3_es" "adm0_a3_cn" "adm0_a3_tw"
##  [66] "adm0_a3_in" "adm0_a3_np" "adm0_a3_pk" "adm0_a3_de" "adm0_a3_gb"
##  [71] "adm0_a3_br" "adm0_a3_il" "adm0_a3_ps" "adm0_a3_sa" "adm0_a3_eg"
##  [76] "adm0_a3_ma" "adm0_a3_pt" "adm0_a3_ar" "adm0_a3_jp" "adm0_a3_ko"
##  [81] "adm0_a3_vn" "adm0_a3_tr" "adm0_a3_id" "adm0_a3_pl" "adm0_a3_gr"
##  [86] "adm0_a3_it" "adm0_a3_nl" "adm0_a3_se" "adm0_a3_bd" "adm0_a3_ua"
##  [91] "adm0_a3_un" "adm0_a3_wb" "continent"  "region_un"  "subregion" 
##  [96] "region_wb"  "name_len"   "long_len"   "abbrev_len" "tiny"      
## [101] "homepart"   "min_zoom"   "min_label"  "max_label"  "label_x"   
## [106] "label_y"    "ne_id"      "wikidataid" "name_ar"    "name_bn"   
## [111] "name_de"    "name_en"    "name_es"    "name_fa"    "name_fr"   
## [116] "name_el"    "name_he"    "name_hi"    "name_hu"    "name_id"   
## [121] "name_it"    "name_ja"    "name_ko"    "name_nl"    "name_pl"   
## [126] "name_pt"    "name_ru"    "name_sv"    "name_tr"    "name_uk"   
## [131] "name_ur"    "name_vi"    "name_zh"    "name_zht"   "fclass_iso"
## [136] "tlc_diff"   "fclass_tlc" "fclass_us"  "fclass_fr"  "fclass_ru" 
## [141] "fclass_es"  "fclass_cn"  "fclass_tw"  "fclass_in"  "fclass_np" 
## [146] "fclass_pk"  "fclass_de"  "fclass_gb"  "fclass_br"  "fclass_il" 
## [151] "fclass_ps"  "fclass_sa"  "fclass_eg"  "fclass_ma"  "fclass_pt" 
## [156] "fclass_ar"  "fclass_jp"  "fclass_ko"  "fclass_vn"  "fclass_tr" 
## [161] "fclass_id"  "fclass_pl"  "fclass_gr"  "fclass_it"  "fclass_nl" 
## [166] "fclass_se"  "fclass_bd"  "fclass_ua"  "geometry"
colnames(countries_data)
## [1] "Entity"     "Code"       "Year"       "Prevalence"
# Unique ISO codes in the world dataset
unique(world$iso_a3)
##   [1] "ZWE" "ZMB" "YEM" "VNM" "VEN" "VAT" "VUT" "UZB" "URY" "FSM" "MHL" "MNP"
##  [13] "VIR" "GUM" "ASM" "PRI" "USA" "SGS" "IOT" "SHN" "PCN" "AIA" "FLK" "CYM"
##  [25] "BMU" "VGB" "TCA" "MSR" "JEY" "GGY" "IMN" "GBR" "ARE" "UKR" "UGA" "TKM"
##  [37] "TUR" "TUN" "TTO" "TON" "TGO" "TLS" "THA" "TZA" "TJK" "TWN" "SYR" "CHE"
##  [49] "SWE" "SWZ" "SUR" "SSD" "SDN" "LKA" "ESP" "KOR" "ZAF" "SOM" "-99" "SLB"
##  [61] "SVK" "SVN" "SGP" "SLE" "SYC" "SRB" "SEN" "SAU" "STP" "SMR" "WSM" "VCT"
##  [73] "LCA" "KNA" "RWA" "RUS" "ROU" "QAT" "PRT" "POL" "PHL" "PER" "PRY" "PNG"
##  [85] "PAN" "PLW" "PAK" "OMN" "PRK" "NGA" "NER" "NIC" "NZL" "NIU" "COK" "NLD"
##  [97] "ABW" "CUW" "NPL" "NRU" "NAM" "MOZ" "MAR" "ESH" "MNE" "MNG" "MDA" "MCO"
## [109] "MEX" "MUS" "MRT" "MLT" "MLI" "MDV" "MYS" "MWI" "MDG" "MKD" "LUX" "LTU"
## [121] "LIE" "LBY" "LBR" "LSO" "LBN" "LVA" "LAO" "KGZ" "KWT" "KIR" "KEN" "KAZ"
## [133] "JOR" "JPN" "JAM" "ITA" "ISR" "PSE" "IRL" "IRQ" "IRN" "IDN" "IND" "ISL"
## [145] "HUN" "HND" "HTI" "GUY" "GNB" "GIN" "GTM" "GRD" "GRC" "GHA" "DEU" "GEO"
## [157] "GMB" "GAB" "SPM" "WLF" "MAF" "BLM" "PYF" "NCL" "ATF" "ALA" "FIN" "FJI"
## [169] "ETH" "EST" "ERI" "GNQ" "SLV" "EGY" "ECU" "DOM" "DMA" "DJI" "GRL" "FRO"
## [181] "DNK" "CZE" "CYP" "CUB" "HRV" "CIV" "CRI" "COD" "COG" "COM" "COL" "CHN"
## [193] "MAC" "HKG" "CHL" "TCD" "CAF" "CPV" "CAN" "CMR" "KHM" "MMR" "BDI" "BFA"
## [205] "BGR" "BRN" "BRA" "BWA" "BIH" "BOL" "BTN" "BEN" "BLZ" "BEL" "BLR" "BRB"
## [217] "BGD" "BHR" "BHS" "AZE" "AUT" "AUS" "HMD" "NFK" "ARM" "ARG" "ATG" "AGO"
## [229] "AND" "DZA" "ALB" "AFG" "ATA" "SXM" "TUV"
# Unique ISO codes in the countries_data dataset
unique(countries_data$Code)
##   [1] "AFG" "ALB" "DZA" "AND" "ARG" "ARM" "AUS" "AUT" "AZE" "BHS" "BHR" "BGD"
##  [13] "BRB" "BLR" "BEL" "BLZ" "BEN" "BOL" "BIH" "BWA" "BRA" "BRN" "BGR" "BFA"
##  [25] "BDI" "KHM" "CMR" "CAN" "CPV" "TCD" "CHL" "CHN" "COL" "COM" "COG" "CRI"
##  [37] "CIV" "HRV" "CUB" "CYP" "CZE" "COD" "DNK" "DOM" "TLS" "ECU" "EGY" "SLV"
##  [49] "ERI" "EST" "SWZ" "ETH" "FJI" "FIN" "FRA" "GMB" "GEO" "DEU" "GHA" "GRC"
##  [61] "GTM" "GNB" "GUY" "HTI" "HUN" "ISL" "IND" "IDN" "IRN" "IRQ" "IRL" "ISR"
##  [73] "ITA" "JAM" "JPN" "JOR" "KAZ" "KEN" "KIR" "KWT" "KGZ" "LAO" "LVA" "LBN"
##  [85] "LSO" "LBR" "LTU" "LUX" "MDG" "MWI" "MYS" "MDV" "MLI" "MLT" "MHL" "MRT"
##  [97] "MUS" "MEX" "MDA" "MNG" "MNE" "MAR" "MOZ" "MMR" "NAM" "NRU" "NPL" "NLD"
## [109] "NZL" "NER" "NGA" "PRK" "NOR" "OMN" "PAK" "PLW" "PAN" "PNG" "PRY" "PER"
## [121] "PHL" "POL" "PRT" "QAT" "ROU" "RUS" "RWA" "WSM" "STP" "SAU" "SEN" "SRB"
## [133] "SYC" "SLE" "SGP" "SVK" "SVN" "SLB" "ZAF" "KOR" "ESP" "LKA" "SWE" "CHE"
## [145] "TZA" "THA" "TGO" "TON" "TUN" "TUR" "TKM" "TUV" "UGA" "UKR" "GBR" "USA"
## [157] "URY" "UZB" "VUT" "VNM" "YEM" "ZMB" "ZWE"
# Identify codes in countries_data but not in world
missing_in_world <- setdiff(countries_data$Code, world$iso_a3)
print(missing_in_world)
## [1] "FRA" "NOR"
# Identify codes in world but not in countries_data
missing_in_data <- setdiff(world$iso_a3, countries_data$Code)
print(missing_in_data)
##  [1] "VEN" "VAT" "FSM" "MNP" "VIR" "GUM" "ASM" "PRI" "SGS" "IOT" "SHN" "PCN"
## [13] "AIA" "FLK" "CYM" "BMU" "VGB" "TCA" "MSR" "JEY" "GGY" "IMN" "ARE" "TTO"
## [25] "TJK" "TWN" "SYR" "SUR" "SSD" "SDN" "SOM" "-99" "SMR" "VCT" "LCA" "KNA"
## [37] "NIC" "NIU" "COK" "ABW" "CUW" "ESH" "MCO" "MKD" "LIE" "LBY" "PSE" "HND"
## [49] "GIN" "GRD" "GAB" "SPM" "WLF" "MAF" "BLM" "PYF" "NCL" "ATF" "ALA" "GNQ"
## [61] "DMA" "DJI" "GRL" "FRO" "MAC" "HKG" "CAF" "BTN" "HMD" "NFK" "ATG" "AGO"
## [73] "ATA" "SXM"
# Fix missing ISO codes
fix_iso_codes <- function(world_data) {
  world_data %>%
    mutate(iso_a3 = ifelse(name == "France", "FRA", iso_a3)) %>%
    mutate(iso_a3 = ifelse(name == "Norway", "NOR", iso_a3))
}
world <- fix_iso_codes(world)
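The output above shows that world$iso_a3 contains the placeholder "-99", and the world data also has an adm0_a3 column. One way to generalize the name-based fix, sketched here on a made-up table, is to fall back to adm0_a3 wherever iso_a3 is "-99". This is an assumption-laden sketch: adm0_a3 is not guaranteed to be an ISO code for every map feature, so the result should be checked before merging.

```r
# Toy version of the map attribute table (not the real Natural Earth data)
world_toy <- data.frame(
  name    = c("France", "Norway", "Zambia"),
  iso_a3  = c("-99", "-99", "ZMB"),
  adm0_a3 = c("FRA", "NOR", "ZMB"),
  stringsAsFactors = FALSE
)

# Where iso_a3 holds the "-99" placeholder, fall back to adm0_a3
world_toy$iso_a3 <- ifelse(world_toy$iso_a3 == "-99",
                           world_toy$adm0_a3,
                           world_toy$iso_a3)

print(world_toy$iso_a3)
```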

# Merge world map data with smoking data
map_data <- world %>%
  left_join(countries_data, by = c("iso_a3" = "Code"))

# Replace NA prevalence values with 0
map_data <- map_data %>%
  mutate(Prevalence = ifelse(is.na(Prevalence), 0, Prevalence))
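Coding missing prevalence as 0 makes a country without data indistinguishable from a country where nobody uses tobacco, and it also affects summary statistics. The toy example below (made-up numbers) illustrates the difference between replacing NA with 0 and keeping NA while skipping it in calculations.

```r
# Toy prevalence values; NA marks a country with no smoking data (made up)
prevalence <- c(30, 20, NA, 10)

# Replacing NA with 0 pulls the average down
with_zeros <- ifelse(is.na(prevalence), 0, prevalence)
mean_with_zeros <- mean(with_zeros)              # (30 + 20 + 0 + 10) / 4

# Keeping NA and skipping it uses only the observed values
mean_observed <- mean(prevalence, na.rm = TRUE)  # (30 + 20 + 10) / 3

print(c(mean_with_zeros, mean_observed))
```

If the NA values were kept, a separate colour for missing countries could be set when plotting, for example through the na.value argument of ggplot2 fill scales.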

# Inspect the merged data
str(map_data)
## Classes 'sf' and 'data.frame':   894 obs. of  172 variables:
##  $ featurecla: chr  "Admin-0 country" "Admin-0 country" "Admin-0 country" "Admin-0 country" ...
##  $ scalerank : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ labelrank : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ sovereignt: chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ sov_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_dif  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ level     : int  2 2 2 2 2 2 2 2 2 2 ...
##  $ type      : chr  "Sovereign country" "Sovereign country" "Sovereign country" "Sovereign country" ...
##  $ tlc       : chr  "1" "1" "1" "1" ...
##  $ admin     : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ adm0_a3   : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ geou_dif  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ geounit   : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ gu_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ su_dif    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ subunit   : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ su_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ brk_diff  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ name      : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ name_long : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ brk_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ brk_name  : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ brk_group : chr  NA NA NA NA ...
##  $ abbrev    : chr  "Zimb." "Zimb." "Zimb." "Zimb." ...
##  $ postal    : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ formal_en : chr  "Republic of Zimbabwe" "Republic of Zimbabwe" "Republic of Zimbabwe" "Republic of Zimbabwe" ...
##  $ formal_fr : chr  NA NA NA NA ...
##  $ name_ciawf: chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ note_adm0 : chr  NA NA NA NA ...
##  $ note_brk  : chr  NA NA NA NA ...
##  $ name_sort : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ name_alt  : chr  NA NA NA NA ...
##  $ mapcolor7 : int  1 1 1 1 1 5 5 5 5 5 ...
##  $ mapcolor8 : int  5 5 5 5 5 8 8 8 8 8 ...
##  $ mapcolor9 : int  3 3 3 3 3 5 5 5 5 5 ...
##  $ mapcolor13: int  9 9 9 9 9 13 13 13 13 13 ...
##  $ pop_est   : num  14645468 14645468 14645468 14645468 14645468 ...
##  $ pop_rank  : int  14 14 14 14 14 14 14 14 14 14 ...
##  $ pop_year  : int  2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 ...
##  $ gdp_md    : int  21440 21440 21440 21440 21440 23309 23309 23309 23309 23309 ...
##  $ gdp_year  : int  2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 ...
##  $ economy   : chr  "5. Emerging region: G20" "5. Emerging region: G20" "5. Emerging region: G20" "5. Emerging region: G20" ...
##  $ income_grp: chr  "5. Low income" "5. Low income" "5. Low income" "5. Low income" ...
##  $ fips_10   : chr  "ZI" "ZI" "ZI" "ZI" ...
##  $ iso_a2    : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ iso_a2_eh : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ iso_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ iso_a3_eh : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ iso_n3    : chr  "716" "716" "716" "716" ...
##  $ iso_n3_eh : chr  "716" "716" "716" "716" ...
##  $ un_a3     : chr  "716" "716" "716" "716" ...
##  $ wb_a2     : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ wb_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ woe_id    : int  23425004 23425004 23425004 23425004 23425004 23425003 23425003 23425003 23425003 23425003 ...
##  $ woe_id_eh : int  23425004 23425004 23425004 23425004 23425004 23425003 23425003 23425003 23425003 23425003 ...
##  $ woe_note  : chr  "Exact WOE match as country" "Exact WOE match as country" "Exact WOE match as country" "Exact WOE match as country" ...
##  $ adm0_iso  : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_diff : chr  NA NA NA NA ...
##  $ adm0_tlc  : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_us: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_fr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ru: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_es: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_cn: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_tw: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_in: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_np: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pk: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_de: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_gb: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_br: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_il: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ps: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_sa: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_eg: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ma: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pt: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ar: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_jp: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ko: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_vn: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_tr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_id: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pl: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_gr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_it: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_nl: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_se: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_bd: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ua: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_un: int  -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 ...
##  $ adm0_a3_wb: int  -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 ...
##  $ continent : chr  "Africa" "Africa" "Africa" "Africa" ...
##  $ region_un : chr  "Africa" "Africa" "Africa" "Africa" ...
##  $ subregion : chr  "Eastern Africa" "Eastern Africa" "Eastern Africa" "Eastern Africa" ...
##  $ region_wb : chr  "Sub-Saharan Africa" "Sub-Saharan Africa" "Sub-Saharan Africa" "Sub-Saharan Africa" ...
##  $ name_len  : int  8 8 8 8 8 6 6 6 6 6 ...
##  $ long_len  : int  8 8 8 8 8 6 6 6 6 6 ...
##  $ abbrev_len: int  5 5 5 5 5 6 6 6 6 6 ...
##   [list output truncated]
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr [1:171] "featurecla" "scalerank" "labelrank" "sovereignt" ...
summary(map_data)
##   featurecla          scalerank      labelrank      sovereignt       
##  Length:894         Min.   :1.00   Min.   :2.000   Length:894        
##  Class :character   1st Qu.:1.00   1st Qu.:3.000   Class :character  
##  Mode  :character   Median :1.00   Median :4.000   Mode  :character  
##                     Mean   :1.73   Mean   :3.868                     
##                     3rd Qu.:3.00   3rd Qu.:5.000                     
##                     Max.   :6.00   Max.   :7.000                     
##                                                                      
##     sov_a3             adm0_dif         level           type          
##  Length:894         Min.   :0.000   Min.   :1.000   Length:894        
##  Class :character   1st Qu.:0.000   1st Qu.:2.000   Class :character  
##  Mode  :character   Median :0.000   Median :2.000   Mode  :character  
##                     Mean   :0.113   Mean   :1.989                     
##                     3rd Qu.:0.000   3rd Qu.:2.000                     
##                     Max.   :1.000   Max.   :2.000                     
##                                                                       
##      tlc               admin             adm0_a3             geou_dif
##  Length:894         Length:894         Length:894         Min.   :0  
##  Class :character   Class :character   Class :character   1st Qu.:0  
##  Mode  :character   Mode  :character   Mode  :character   Median :0  
##                                                           Mean   :0  
##                                                           3rd Qu.:0  
##                                                           Max.   :0  
##                                                                      
##    geounit             gu_a3               su_dif          subunit         
##  Length:894         Length:894         Min.   :0.00000   Length:894        
##  Class :character   Class :character   1st Qu.:0.00000   Class :character  
##  Mode  :character   Mode  :character   Median :0.00000   Mode  :character  
##                                        Mean   :0.01119                     
##                                        3rd Qu.:0.00000                     
##                                        Max.   :1.00000                     
##                                                                            
##     su_a3              brk_diff           name            name_long        
##  Length:894         Min.   :0.00000   Length:894         Length:894        
##  Class :character   1st Qu.:0.00000   Class :character   Class :character  
##  Mode  :character   Median :0.00000   Mode  :character   Mode  :character  
##                     Mean   :0.01007                                        
##                     3rd Qu.:0.00000                                        
##                     Max.   :1.00000                                        
##                                                                            
##     brk_a3            brk_name          brk_group            abbrev         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     postal           formal_en          formal_fr          name_ciawf       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   note_adm0           note_brk          name_sort           name_alt        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    mapcolor7       mapcolor8       mapcolor9       mapcolor13     
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :-99.000  
##  1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:  3.000  
##  Median :3.000   Median :3.000   Median :3.000   Median :  6.000  
##  Mean   :3.177   Mean   :3.459   Mean   :3.752   Mean   :  6.208  
##  3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:  9.000  
##  Max.   :7.000   Max.   :8.000   Max.   :9.000   Max.   : 13.000  
##                                                                   
##     pop_est             pop_rank        pop_year        gdp_md        
##  Min.   :0.000e+00   Min.   : 1.00   Min.   :2011   Min.   :     -99  
##  1st Qu.:1.962e+06   1st Qu.:12.00   1st Qu.:2019   1st Qu.:   10354  
##  Median :8.776e+06   Median :13.00   Median :2019   Median :   40000  
##  Mean   :4.177e+07   Mean   :12.98   Mean   :2019   Mean   :  479934  
##  3rd Qu.:3.037e+07   3rd Qu.:15.00   3rd Qu.:2019   3rd Qu.:  250529  
##  Max.   :1.398e+09   Max.   :18.00   Max.   :2020   Max.   :21433226  
##                                                                       
##     gdp_year      economy           income_grp          fips_10         
##  Min.   :2003   Length:894         Length:894         Length:894        
##  1st Qu.:2019   Class :character   Class :character   Class :character  
##  Median :2019   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :2019                                                           
##  3rd Qu.:2019                                                           
##  Max.   :2019                                                           
##                                                                         
##     iso_a2           iso_a2_eh            iso_a3           iso_a3_eh        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     iso_n3           iso_n3_eh            un_a3              wb_a2          
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     wb_a3               woe_id           woe_id_eh          woe_note        
##  Length:894         Min.   :     -99   Min.   :     -99   Length:894        
##  Class :character   1st Qu.:23424781   1st Qu.:23424794   Class :character  
##  Mode  :character   Median :23424863   Median :23424871   Mode  :character  
##                     Mean   :22202644   Mean   :23423325                     
##                     3rd Qu.:23424929   3rd Qu.:23424933                     
##                     Max.   :56042305   Max.   :56042305                     
##                                                                             
##    adm0_iso          adm0_diff           adm0_tlc          adm0_a3_us       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_fr         adm0_a3_ru         adm0_a3_es         adm0_a3_cn       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_tw         adm0_a3_in         adm0_a3_np         adm0_a3_pk       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_de         adm0_a3_gb         adm0_a3_br         adm0_a3_il       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_ps         adm0_a3_sa         adm0_a3_eg         adm0_a3_ma       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_pt         adm0_a3_ar         adm0_a3_jp         adm0_a3_ko       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_vn         adm0_a3_tr         adm0_a3_id         adm0_a3_pl       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_gr         adm0_a3_it         adm0_a3_nl         adm0_a3_se       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_bd         adm0_a3_ua          adm0_a3_un    adm0_a3_wb 
##  Length:894         Length:894         Min.   :-99   Min.   :-99  
##  Class :character   Class :character   1st Qu.:-99   1st Qu.:-99  
##  Mode  :character   Mode  :character   Median :-99   Median :-99  
##                                        Mean   :-99   Mean   :-99  
##                                        3rd Qu.:-99   3rd Qu.:-99  
##                                        Max.   :-99   Max.   :-99  
##                                                                   
##   continent          region_un          subregion          region_wb        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     name_len         long_len        abbrev_len          tiny      
##  Min.   : 4.000   Min.   : 4.000   Min.   : 3.000   Min.   :-99.0  
##  1st Qu.: 6.000   1st Qu.: 6.000   1st Qu.: 4.000   1st Qu.:-99.0  
##  Median : 7.000   Median : 7.000   Median : 4.000   Median :-99.0  
##  Mean   : 8.102   Mean   : 8.935   Mean   : 4.736   Mean   :-82.1  
##  3rd Qu.:10.000   3rd Qu.:10.000   3rd Qu.: 5.000   3rd Qu.:-99.0  
##  Max.   :25.000   Max.   :35.000   Max.   :13.000   Max.   :  6.0  
##                                                                    
##     homepart          min_zoom         min_label       max_label     
##  Min.   :-99.000   Min.   :0.00000   Min.   :1.700   Min.   : 5.200  
##  1st Qu.:  1.000   1st Qu.:0.00000   1st Qu.:2.700   1st Qu.: 7.000  
##  Median :  1.000   Median :0.00000   Median :3.000   Median : 8.000  
##  Mean   : -3.586   Mean   :0.02315   Mean   :3.392   Mean   : 8.268  
##  3rd Qu.:  1.000   3rd Qu.:0.00000   3rd Qu.:4.000   3rd Qu.: 9.000  
##  Max.   :  1.000   Max.   :7.00000   Max.   :6.500   Max.   :11.000  
##                                                                      
##     label_x            label_y           ne_id            wikidataid       
##  Min.   :-178.137   Min.   :-79.84   Min.   :1.159e+09   Length:894        
##  1st Qu.:  -7.187   1st Qu.:  1.48   1st Qu.:1.159e+09   Class :character  
##  Median :  21.726   Median : 18.69   Median :1.159e+09   Mode  :character  
##  Mean   :  21.259   Mean   : 18.70   Mean   :1.159e+09                     
##  3rd Qu.:  51.144   3rd Qu.: 40.40   3rd Qu.:1.159e+09                     
##  Max.   : 179.210   Max.   : 74.32   Max.   :1.159e+09                     
##                                                                            
##    name_ar            name_bn            name_de            name_en         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_es            name_fa            name_fr            name_el         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_he            name_hi            name_hu            name_id         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_it            name_ja            name_ko            name_nl         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_pl            name_pt            name_ru            name_sv         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_tr            name_uk            name_ur            name_vi         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_zh            name_zht          fclass_iso          tlc_diff        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_tlc         fclass_us          fclass_fr          fclass_ru        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_es          fclass_cn          fclass_tw          fclass_in        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_np          fclass_pk          fclass_de          fclass_gb        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_br          fclass_il          fclass_ps          fclass_sa        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_eg          fclass_ma          fclass_pt          fclass_ar        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_jp          fclass_ko          fclass_vn          fclass_tr        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_id          fclass_pl          fclass_gr          fclass_it        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_nl          fclass_se          fclass_bd          fclass_ua        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     Entity               Year        Prevalence             geometry  
##  Length:894         Min.   :2000   Min.   : 0.00   MULTIPOLYGON :894  
##  Class :character   1st Qu.:2005   1st Qu.:13.00   epsg:4326    :  0  
##  Mode  :character   Median :2010   Median :23.30   +proj=long...:  0  
##                     Mean   :2010   Mean   :22.59                      
##                     3rd Qu.:2015   3rd Qu.:31.50                      
##                     Max.   :2020   Max.   :68.50                      
##                     NA's   :79
# Drop the geometry column so the data is a plain data frame (for Plotly compatibility)
plot_data <- map_data %>%
  st_set_geometry(NULL)
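
As a minimal sketch of what dropping the geometry does, the example below builds a small, made-up `sf` object (the `pts` object, its column values, and its coordinates are hypothetical and not taken from the dataset) and strips the geometry in two ways. It assumes the `sf` package is installed; `st_drop_geometry()` is shown only as an alternative spelling of the `st_set_geometry(NULL)` call used in this project.

```r
library(sf)

# Hypothetical two-point sf object; names and coordinates are illustrative only
pts <- st_as_sf(
  data.frame(
    Entity = c("A", "B"),
    Prevalence = c(12.5, 30.1),
    lon = c(0, 10),
    lat = c(50, 45)
  ),
  coords = c("lon", "lat"),  # these two columns become the geometry column
  crs = 4326
)

flat_a <- st_set_geometry(pts, NULL)  # approach used in this project
flat_b <- st_drop_geometry(pts)       # alternative helper from sf

inherits(pts, "sf")     # the original object carries geometry
inherits(flat_a, "sf")  # the flattened copy no longer does
names(flat_b)           # only the attribute columns remain
```

Either call leaves the attribute columns untouched, so `Entity`, `Year`, and `Prevalence` stay available for plotting.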


# Interactive map
# Determine min and max prevalence values
min_prevalence <- min(plot_data$Prevalence, na.rm = TRUE)
max_prevalence <- max(plot_data$Prevalence, na.rm = TRUE)
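
The `na.rm = TRUE` argument guards against missing values: if a column contains any `NA`, `min()` and `max()` return `NA` unless missing values are removed first. A minimal sketch on a made-up data frame (the `toy` object and its values are hypothetical, not taken from the dataset) illustrates the difference:

```r
# Hypothetical toy data; values are illustrative only
toy <- data.frame(
  Entity = c("A", "B", "C"),
  Prevalence = c(12.5, NA, 30.1)
)

min(toy$Prevalence)                  # NA, because one value is missing
min(toy$Prevalence, na.rm = TRUE)    # 12.5
max(toy$Prevalence, na.rm = TRUE)    # 30.1
range(toy$Prevalence, na.rm = TRUE)  # both limits in one call
```

`range()` is a compact alternative when both limits are needed together, for example to fix a colour scale across years.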


# Removing duplicates
# First, check the structure of the dataset
str(plot_data)
## 'data.frame':    894 obs. of  171 variables:
##  $ featurecla: chr  "Admin-0 country" "Admin-0 country" "Admin-0 country" "Admin-0 country" ...
##  $ scalerank : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ labelrank : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ sovereignt: chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ sov_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_dif  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ level     : int  2 2 2 2 2 2 2 2 2 2 ...
##  $ type      : chr  "Sovereign country" "Sovereign country" "Sovereign country" "Sovereign country" ...
##  $ tlc       : chr  "1" "1" "1" "1" ...
##  $ admin     : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ adm0_a3   : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ geou_dif  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ geounit   : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ gu_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ su_dif    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ subunit   : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ su_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ brk_diff  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ name      : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ name_long : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ brk_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ brk_name  : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ brk_group : chr  NA NA NA NA ...
##  $ abbrev    : chr  "Zimb." "Zimb." "Zimb." "Zimb." ...
##  $ postal    : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ formal_en : chr  "Republic of Zimbabwe" "Republic of Zimbabwe" "Republic of Zimbabwe" "Republic of Zimbabwe" ...
##  $ formal_fr : chr  NA NA NA NA ...
##  $ name_ciawf: chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ note_adm0 : chr  NA NA NA NA ...
##  $ note_brk  : chr  NA NA NA NA ...
##  $ name_sort : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ name_alt  : chr  NA NA NA NA ...
##  $ mapcolor7 : int  1 1 1 1 1 5 5 5 5 5 ...
##  $ mapcolor8 : int  5 5 5 5 5 8 8 8 8 8 ...
##  $ mapcolor9 : int  3 3 3 3 3 5 5 5 5 5 ...
##  $ mapcolor13: int  9 9 9 9 9 13 13 13 13 13 ...
##  $ pop_est   : num  14645468 14645468 14645468 14645468 14645468 ...
##  $ pop_rank  : int  14 14 14 14 14 14 14 14 14 14 ...
##  $ pop_year  : int  2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 ...
##  $ gdp_md    : int  21440 21440 21440 21440 21440 23309 23309 23309 23309 23309 ...
##  $ gdp_year  : int  2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 ...
##  $ economy   : chr  "5. Emerging region: G20" "5. Emerging region: G20" "5. Emerging region: G20" "5. Emerging region: G20" ...
##  $ income_grp: chr  "5. Low income" "5. Low income" "5. Low income" "5. Low income" ...
##  $ fips_10   : chr  "ZI" "ZI" "ZI" "ZI" ...
##  $ iso_a2    : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ iso_a2_eh : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ iso_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ iso_a3_eh : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ iso_n3    : chr  "716" "716" "716" "716" ...
##  $ iso_n3_eh : chr  "716" "716" "716" "716" ...
##  $ un_a3     : chr  "716" "716" "716" "716" ...
##  $ wb_a2     : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ wb_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ woe_id    : int  23425004 23425004 23425004 23425004 23425004 23425003 23425003 23425003 23425003 23425003 ...
##  $ woe_id_eh : int  23425004 23425004 23425004 23425004 23425004 23425003 23425003 23425003 23425003 23425003 ...
##  $ woe_note  : chr  "Exact WOE match as country" "Exact WOE match as country" "Exact WOE match as country" "Exact WOE match as country" ...
##  $ adm0_iso  : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_diff : chr  NA NA NA NA ...
##  $ adm0_tlc  : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_us: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_fr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ru: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_es: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_cn: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_tw: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_in: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_np: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pk: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_de: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_gb: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_br: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_il: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ps: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_sa: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_eg: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ma: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pt: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ar: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_jp: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ko: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_vn: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_tr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_id: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pl: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_gr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_it: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_nl: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_se: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_bd: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ua: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_un: int  -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 ...
##  $ adm0_a3_wb: int  -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 ...
##  $ continent : chr  "Africa" "Africa" "Africa" "Africa" ...
##  $ region_un : chr  "Africa" "Africa" "Africa" "Africa" ...
##  $ subregion : chr  "Eastern Africa" "Eastern Africa" "Eastern Africa" "Eastern Africa" ...
##  $ region_wb : chr  "Sub-Saharan Africa" "Sub-Saharan Africa" "Sub-Saharan Africa" "Sub-Saharan Africa" ...
##  $ name_len  : int  8 8 8 8 8 6 6 6 6 6 ...
##  $ long_len  : int  8 8 8 8 8 6 6 6 6 6 ...
##  $ abbrev_len: int  5 5 5 5 5 6 6 6 6 6 ...
##   [list output truncated]
# Check for duplicate country-year rows: keep any (iso_a3, Year) group with more than one row
duplicates <- plot_data %>%
  group_by(iso_a3, Year) %>%
  filter(n() > 1)
print(duplicates)
## # A tibble: 6 × 171
## # Groups:   iso_a3, Year [1]
##   featurecla    scalerank labelrank sovereignt sov_a3 adm0_dif level type  tlc  
##   <chr>             <int>     <int> <chr>      <chr>     <int> <int> <chr> <chr>
## 1 Admin-0 coun…         1         5 Somaliland SOL           0     2 Sove… 1    
## 2 Admin-0 coun…         1         6 Kosovo     KOS           0     2 Disp… 1    
## 3 Admin-0 coun…         1         6 Northern … CYN           0     2 Sove… 1    
## 4 Admin-0 coun…         5         5 Australia  AU1           1     2 Depe… 1    
## 5 Admin-0 coun…         5         5 Australia  AU1           1     2 Depe… 1    
## 6 Admin-0 coun…         1         5 Kashmir    KAS           0     2 Inde… <NA> 
## # ℹ 162 more variables: admin <chr>, adm0_a3 <chr>, geou_dif <int>,
## #   geounit <chr>, gu_a3 <chr>, su_dif <int>, subunit <chr>, su_a3 <chr>,
## #   brk_diff <int>, name <chr>, name_long <chr>, brk_a3 <chr>, brk_name <chr>,
## #   brk_group <chr>, abbrev <chr>, postal <chr>, formal_en <chr>,
## #   formal_fr <chr>, name_ciawf <chr>, note_adm0 <chr>, note_brk <chr>,
## #   name_sort <chr>, name_alt <chr>, mapcolor7 <int>, mapcolor8 <int>,
## #   mapcolor9 <int>, mapcolor13 <int>, pop_est <dbl>, pop_rank <int>, …
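The six rows above all fall into a single `iso_a3`/`Year` group. A likely explanation, assuming the map layer is Natural Earth data (where `iso_a3` holds the placeholder "-99" for territories without an ISO code), is that these territories share the placeholder and have no matching prevalence row, so they are not true duplicates. The following is a minimal sketch on toy data of separating placeholder codes before running the duplicate check; the `toy` data frame and its values are hypothetical and only stand in for `plot_data`.

```r
library(dplyr)

# Toy stand-in for plot_data; rows and values are illustrative only
toy <- data.frame(
  sovereignt = c("Somaliland", "Kosovo", "Zimbabwe", "Zimbabwe"),
  iso_a3     = c("-99", "-99", "ZWE", "ZWE"),
  Year       = c(NA, NA, 2000L, 2005L)
)

# Rows carrying the placeholder code (assumed here to be "-99")
placeholders <- toy %>%
  filter(iso_a3 == "-99")

# Duplicate check restricted to rows with a real ISO code
true_duplicates <- toy %>%
  filter(iso_a3 != "-99") %>%
  group_by(iso_a3, Year) %>%
  filter(n() > 1) %>%
  ungroup()

nrow(placeholders)     # rows flagged only because they share the placeholder
nrow(true_duplicates)  # genuine repeated country-year rows
```

If `true_duplicates` is empty on the real data, the rows flagged by the original check can be treated as unmatched territories and not as repeated observations.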
# Check for missing or incorrect values
summary(plot_data)
##   featurecla          scalerank      labelrank      sovereignt       
##  Length:894         Min.   :1.00   Min.   :2.000   Length:894        
##  Class :character   1st Qu.:1.00   1st Qu.:3.000   Class :character  
##  Mode  :character   Median :1.00   Median :4.000   Mode  :character  
##                     Mean   :1.73   Mean   :3.868                     
##                     3rd Qu.:3.00   3rd Qu.:5.000                     
##                     Max.   :6.00   Max.   :7.000                     
##                                                                      
##     sov_a3             adm0_dif         level           type          
##  Length:894         Min.   :0.000   Min.   :1.000   Length:894        
##  Class :character   1st Qu.:0.000   1st Qu.:2.000   Class :character  
##  Mode  :character   Median :0.000   Median :2.000   Mode  :character  
##                     Mean   :0.113   Mean   :1.989                     
##                     3rd Qu.:0.000   3rd Qu.:2.000                     
##                     Max.   :1.000   Max.   :2.000                     
##                                                                       
##      tlc               admin             adm0_a3             geou_dif
##  Length:894         Length:894         Length:894         Min.   :0  
##  Class :character   Class :character   Class :character   1st Qu.:0  
##  Mode  :character   Mode  :character   Mode  :character   Median :0  
##                                                           Mean   :0  
##                                                           3rd Qu.:0  
##                                                           Max.   :0  
##                                                                      
##    geounit             gu_a3               su_dif          subunit         
##  Length:894         Length:894         Min.   :0.00000   Length:894        
##  Class :character   Class :character   1st Qu.:0.00000   Class :character  
##  Mode  :character   Mode  :character   Median :0.00000   Mode  :character  
##                                        Mean   :0.01119                     
##                                        3rd Qu.:0.00000                     
##                                        Max.   :1.00000                     
##                                                                            
##     su_a3              brk_diff           name            name_long        
##  Length:894         Min.   :0.00000   Length:894         Length:894        
##  Class :character   1st Qu.:0.00000   Class :character   Class :character  
##  Mode  :character   Median :0.00000   Mode  :character   Mode  :character  
##                     Mean   :0.01007                                        
##                     3rd Qu.:0.00000                                        
##                     Max.   :1.00000                                        
##                                                                            
##     brk_a3            brk_name          brk_group            abbrev         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     postal           formal_en          formal_fr          name_ciawf       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   note_adm0           note_brk          name_sort           name_alt        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    mapcolor7       mapcolor8       mapcolor9       mapcolor13     
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :-99.000  
##  1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:  3.000  
##  Median :3.000   Median :3.000   Median :3.000   Median :  6.000  
##  Mean   :3.177   Mean   :3.459   Mean   :3.752   Mean   :  6.208  
##  3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:  9.000  
##  Max.   :7.000   Max.   :8.000   Max.   :9.000   Max.   : 13.000  
##                                                                   
##     pop_est             pop_rank        pop_year        gdp_md        
##  Min.   :0.000e+00   Min.   : 1.00   Min.   :2011   Min.   :     -99  
##  1st Qu.:1.962e+06   1st Qu.:12.00   1st Qu.:2019   1st Qu.:   10354  
##  Median :8.776e+06   Median :13.00   Median :2019   Median :   40000  
##  Mean   :4.177e+07   Mean   :12.98   Mean   :2019   Mean   :  479934  
##  3rd Qu.:3.037e+07   3rd Qu.:15.00   3rd Qu.:2019   3rd Qu.:  250529  
##  Max.   :1.398e+09   Max.   :18.00   Max.   :2020   Max.   :21433226  
##                                                                       
##     gdp_year      economy           income_grp          fips_10         
##  Min.   :2003   Length:894         Length:894         Length:894        
##  1st Qu.:2019   Class :character   Class :character   Class :character  
##  Median :2019   Mode  :character   Mode  :character   Mode  :character  
##  Mean   :2019                                                           
##  3rd Qu.:2019                                                           
##  Max.   :2019                                                           
##                                                                         
##     iso_a2           iso_a2_eh            iso_a3           iso_a3_eh        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     iso_n3           iso_n3_eh            un_a3              wb_a2          
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     wb_a3               woe_id           woe_id_eh          woe_note        
##  Length:894         Min.   :     -99   Min.   :     -99   Length:894        
##  Class :character   1st Qu.:23424781   1st Qu.:23424794   Class :character  
##  Mode  :character   Median :23424863   Median :23424871   Mode  :character  
##                     Mean   :22202644   Mean   :23423325                     
##                     3rd Qu.:23424929   3rd Qu.:23424933                     
##                     Max.   :56042305   Max.   :56042305                     
##                                                                             
##    adm0_iso          adm0_diff           adm0_tlc          adm0_a3_us       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_fr         adm0_a3_ru         adm0_a3_es         adm0_a3_cn       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_tw         adm0_a3_in         adm0_a3_np         adm0_a3_pk       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_de         adm0_a3_gb         adm0_a3_br         adm0_a3_il       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_ps         adm0_a3_sa         adm0_a3_eg         adm0_a3_ma       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_pt         adm0_a3_ar         adm0_a3_jp         adm0_a3_ko       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_vn         adm0_a3_tr         adm0_a3_id         adm0_a3_pl       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_gr         adm0_a3_it         adm0_a3_nl         adm0_a3_se       
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   adm0_a3_bd         adm0_a3_ua          adm0_a3_un    adm0_a3_wb 
##  Length:894         Length:894         Min.   :-99   Min.   :-99  
##  Class :character   Class :character   1st Qu.:-99   1st Qu.:-99  
##  Mode  :character   Mode  :character   Median :-99   Median :-99  
##                                        Mean   :-99   Mean   :-99  
##                                        3rd Qu.:-99   3rd Qu.:-99  
##                                        Max.   :-99   Max.   :-99  
##                                                                   
##   continent          region_un          subregion          region_wb        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     name_len         long_len        abbrev_len          tiny      
##  Min.   : 4.000   Min.   : 4.000   Min.   : 3.000   Min.   :-99.0  
##  1st Qu.: 6.000   1st Qu.: 6.000   1st Qu.: 4.000   1st Qu.:-99.0  
##  Median : 7.000   Median : 7.000   Median : 4.000   Median :-99.0  
##  Mean   : 8.102   Mean   : 8.935   Mean   : 4.736   Mean   :-82.1  
##  3rd Qu.:10.000   3rd Qu.:10.000   3rd Qu.: 5.000   3rd Qu.:-99.0  
##  Max.   :25.000   Max.   :35.000   Max.   :13.000   Max.   :  6.0  
##                                                                    
##     homepart          min_zoom         min_label       max_label     
##  Min.   :-99.000   Min.   :0.00000   Min.   :1.700   Min.   : 5.200  
##  1st Qu.:  1.000   1st Qu.:0.00000   1st Qu.:2.700   1st Qu.: 7.000  
##  Median :  1.000   Median :0.00000   Median :3.000   Median : 8.000  
##  Mean   : -3.586   Mean   :0.02315   Mean   :3.392   Mean   : 8.268  
##  3rd Qu.:  1.000   3rd Qu.:0.00000   3rd Qu.:4.000   3rd Qu.: 9.000  
##  Max.   :  1.000   Max.   :7.00000   Max.   :6.500   Max.   :11.000  
##                                                                      
##     label_x            label_y           ne_id            wikidataid       
##  Min.   :-178.137   Min.   :-79.84   Min.   :1.159e+09   Length:894        
##  1st Qu.:  -7.187   1st Qu.:  1.48   1st Qu.:1.159e+09   Class :character  
##  Median :  21.726   Median : 18.69   Median :1.159e+09   Mode  :character  
##  Mean   :  21.259   Mean   : 18.70   Mean   :1.159e+09                     
##  3rd Qu.:  51.144   3rd Qu.: 40.40   3rd Qu.:1.159e+09                     
##  Max.   : 179.210   Max.   : 74.32   Max.   :1.159e+09                     
##                                                                            
##    name_ar            name_bn            name_de            name_en         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_es            name_fa            name_fr            name_el         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_he            name_hi            name_hu            name_id         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_it            name_ja            name_ko            name_nl         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_pl            name_pt            name_ru            name_sv         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_tr            name_uk            name_ur            name_vi         
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##    name_zh            name_zht          fclass_iso          tlc_diff        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_tlc         fclass_us          fclass_fr          fclass_ru        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_es          fclass_cn          fclass_tw          fclass_in        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_np          fclass_pk          fclass_de          fclass_gb        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_br          fclass_il          fclass_ps          fclass_sa        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_eg          fclass_ma          fclass_pt          fclass_ar        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_jp          fclass_ko          fclass_vn          fclass_tr        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_id          fclass_pl          fclass_gr          fclass_it        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   fclass_nl          fclass_se          fclass_bd          fclass_ua        
##  Length:894         Length:894         Length:894         Length:894        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##     Entity               Year        Prevalence   
##  Length:894         Min.   :2000   Min.   : 0.00  
##  Class :character   1st Qu.:2005   1st Qu.:13.00  
##  Mode  :character   Median :2010   Median :23.30  
##                     Mean   :2010   Mean   :22.59  
##                     3rd Qu.:2015   3rd Qu.:31.50  
##                     Max.   :2020   Max.   :68.50  
##                     NA's   :79
# Find rows in the merged map_data where Prevalence is NA
unmatched <- map_data %>%
  filter(is.na(Prevalence))

head(unmatched)  # Check if these rows correspond to Somaliland, Kosovo, etc.
## Simple feature collection with 0 features and 171 fields
## Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
## Geodetic CRS:  WGS 84
##   [1] featurecla scalerank  labelrank  sovereignt sov_a3     adm0_dif  
##   [7] level      type       tlc        admin      adm0_a3    geou_dif  
##  [13] geounit    gu_a3      su_dif     subunit    su_a3      brk_diff  
##  [19] name       name_long  brk_a3     brk_name   brk_group  abbrev    
##  [25] postal     formal_en  formal_fr  name_ciawf note_adm0  note_brk  
##  [31] name_sort  name_alt   mapcolor7  mapcolor8  mapcolor9  mapcolor13
##  [37] pop_est    pop_rank   pop_year   gdp_md     gdp_year   economy   
##  [43] income_grp fips_10    iso_a2     iso_a2_eh  iso_a3     iso_a3_eh 
##  [49] iso_n3     iso_n3_eh  un_a3      wb_a2      wb_a3      woe_id    
##  [55] woe_id_eh  woe_note   adm0_iso   adm0_diff  adm0_tlc   adm0_a3_us
##  [61] adm0_a3_fr adm0_a3_ru adm0_a3_es adm0_a3_cn adm0_a3_tw adm0_a3_in
##  [67] adm0_a3_np adm0_a3_pk adm0_a3_de adm0_a3_gb adm0_a3_br adm0_a3_il
##  [73] adm0_a3_ps adm0_a3_sa adm0_a3_eg adm0_a3_ma adm0_a3_pt adm0_a3_ar
##  [79] adm0_a3_jp adm0_a3_ko adm0_a3_vn adm0_a3_tr adm0_a3_id adm0_a3_pl
##  [85] adm0_a3_gr adm0_a3_it adm0_a3_nl adm0_a3_se adm0_a3_bd adm0_a3_ua
##  [91] adm0_a3_un adm0_a3_wb continent  region_un  subregion  region_wb 
##  [97] name_len   long_len   abbrev_len tiny       homepart   min_zoom  
## [103] min_label  max_label  label_x    label_y    ne_id      wikidataid
## [109] name_ar    name_bn    name_de    name_en    name_es    name_fa   
## [115] name_fr    name_el    name_he    name_hi    name_hu    name_id   
## [121] name_it    name_ja    name_ko    name_nl    name_pl    name_pt   
## [127] name_ru    name_sv    name_tr    name_uk    name_ur    name_vi   
## [133] name_zh    name_zht   fclass_iso tlc_diff   fclass_tlc fclass_us 
## [139] fclass_fr  fclass_ru  fclass_es  fclass_cn  fclass_tw  fclass_in 
## [145] fclass_np  fclass_pk  fclass_de  fclass_gb  fclass_br  fclass_il 
## [151] fclass_ps  fclass_sa  fclass_eg  fclass_ma  fclass_pt  fclass_ar 
## [157] fclass_jp  fclass_ko  fclass_vn  fclass_tr  fclass_id  fclass_pl 
## [163] fclass_gr  fclass_it  fclass_nl  fclass_se  fclass_bd  fclass_ua 
## [169] Entity     Year       Prevalence geometry  
## <0 rows> (or 0-length row.names)
map_data <- map_data %>%
  filter(!is.na(Prevalence))

# Find codes in world that do not match countries_data
missing_in_countries_data <- setdiff(world$iso_a3, countries_data$Code)
print(missing_in_countries_data)
##  [1] "VEN" "VAT" "FSM" "MNP" "VIR" "GUM" "ASM" "PRI" "SGS" "IOT" "SHN" "PCN"
## [13] "AIA" "FLK" "CYM" "BMU" "VGB" "TCA" "MSR" "JEY" "GGY" "IMN" "ARE" "TTO"
## [25] "TJK" "TWN" "SYR" "SUR" "SSD" "SDN" "SOM" "-99" "SMR" "VCT" "LCA" "KNA"
## [37] "NIC" "NIU" "COK" "ABW" "CUW" "ESH" "MCO" "MKD" "LIE" "LBY" "PSE" "HND"
## [49] "GIN" "GRD" "GAB" "SPM" "WLF" "MAF" "BLM" "PYF" "NCL" "ATF" "ALA" "GNQ"
## [61] "DMA" "DJI" "GRL" "FRO" "MAC" "HKG" "CAF" "BTN" "HMD" "NFK" "ATG" "AGO"
## [73] "ATA" "SXM"
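
The `setdiff()` call above returns bare codes only, which makes it hard to see which places they refer to. As a minimal sketch, assuming the map table carries a readable name column, `anti_join()` can list the unmatched rows together with their names. The objects `world_toy` and `countries_toy` and their values are made up for illustration; they are not the project data.

```r
library(dplyr)

# Hypothetical stand-ins for `world` and `countries_data` (toy values only)
world_toy <- data.frame(
  iso_a3 = c("ZWE", "VEN", "-99", "FRA"),
  name   = c("Zimbabwe", "Venezuela", "Kosovo", "France"),
  stringsAsFactors = FALSE
)
countries_toy <- data.frame(
  Code       = c("ZWE", "FRA"),
  Year       = c(2020, 2020),
  Prevalence = c(10, 20),   # placeholder numbers, not real estimates
  stringsAsFactors = FALSE
)

# anti_join() keeps the rows of the first table that have no match in the second
unmatched_toy <- world_toy %>%
  anti_join(countries_toy, by = c("iso_a3" = "Code")) %>%
  select(iso_a3, name)

print(unmatched_toy)
```

Applied to the real objects, the same pattern would show the names behind the codes listed above.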
map_data <- world %>%
  inner_join(countries_data, by = c("iso_a3" = "Code"))


# Check structure of the dataset
str(plot_data)
## 'data.frame':    894 obs. of  171 variables:
##  $ featurecla: chr  "Admin-0 country" "Admin-0 country" "Admin-0 country" "Admin-0 country" ...
##  $ scalerank : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ labelrank : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ sovereignt: chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ sov_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_dif  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ level     : int  2 2 2 2 2 2 2 2 2 2 ...
##  $ type      : chr  "Sovereign country" "Sovereign country" "Sovereign country" "Sovereign country" ...
##  $ tlc       : chr  "1" "1" "1" "1" ...
##  $ admin     : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ adm0_a3   : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ geou_dif  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ geounit   : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ gu_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ su_dif    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ subunit   : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ su_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ brk_diff  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ name      : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ name_long : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ brk_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ brk_name  : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ brk_group : chr  NA NA NA NA ...
##  $ abbrev    : chr  "Zimb." "Zimb." "Zimb." "Zimb." ...
##  $ postal    : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ formal_en : chr  "Republic of Zimbabwe" "Republic of Zimbabwe" "Republic of Zimbabwe" "Republic of Zimbabwe" ...
##  $ formal_fr : chr  NA NA NA NA ...
##  $ name_ciawf: chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ note_adm0 : chr  NA NA NA NA ...
##  $ note_brk  : chr  NA NA NA NA ...
##  $ name_sort : chr  "Zimbabwe" "Zimbabwe" "Zimbabwe" "Zimbabwe" ...
##  $ name_alt  : chr  NA NA NA NA ...
##  $ mapcolor7 : int  1 1 1 1 1 5 5 5 5 5 ...
##  $ mapcolor8 : int  5 5 5 5 5 8 8 8 8 8 ...
##  $ mapcolor9 : int  3 3 3 3 3 5 5 5 5 5 ...
##  $ mapcolor13: int  9 9 9 9 9 13 13 13 13 13 ...
##  $ pop_est   : num  14645468 14645468 14645468 14645468 14645468 ...
##  $ pop_rank  : int  14 14 14 14 14 14 14 14 14 14 ...
##  $ pop_year  : int  2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 ...
##  $ gdp_md    : int  21440 21440 21440 21440 21440 23309 23309 23309 23309 23309 ...
##  $ gdp_year  : int  2019 2019 2019 2019 2019 2019 2019 2019 2019 2019 ...
##  $ economy   : chr  "5. Emerging region: G20" "5. Emerging region: G20" "5. Emerging region: G20" "5. Emerging region: G20" ...
##  $ income_grp: chr  "5. Low income" "5. Low income" "5. Low income" "5. Low income" ...
##  $ fips_10   : chr  "ZI" "ZI" "ZI" "ZI" ...
##  $ iso_a2    : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ iso_a2_eh : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ iso_a3    : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ iso_a3_eh : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ iso_n3    : chr  "716" "716" "716" "716" ...
##  $ iso_n3_eh : chr  "716" "716" "716" "716" ...
##  $ un_a3     : chr  "716" "716" "716" "716" ...
##  $ wb_a2     : chr  "ZW" "ZW" "ZW" "ZW" ...
##  $ wb_a3     : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ woe_id    : int  23425004 23425004 23425004 23425004 23425004 23425003 23425003 23425003 23425003 23425003 ...
##  $ woe_id_eh : int  23425004 23425004 23425004 23425004 23425004 23425003 23425003 23425003 23425003 23425003 ...
##  $ woe_note  : chr  "Exact WOE match as country" "Exact WOE match as country" "Exact WOE match as country" "Exact WOE match as country" ...
##  $ adm0_iso  : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_diff : chr  NA NA NA NA ...
##  $ adm0_tlc  : chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_us: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_fr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ru: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_es: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_cn: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_tw: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_in: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_np: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pk: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_de: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_gb: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_br: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_il: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ps: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_sa: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_eg: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ma: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pt: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ar: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_jp: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ko: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_vn: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_tr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_id: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_pl: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_gr: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_it: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_nl: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_se: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_bd: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_ua: chr  "ZWE" "ZWE" "ZWE" "ZWE" ...
##  $ adm0_a3_un: int  -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 ...
##  $ adm0_a3_wb: int  -99 -99 -99 -99 -99 -99 -99 -99 -99 -99 ...
##  $ continent : chr  "Africa" "Africa" "Africa" "Africa" ...
##  $ region_un : chr  "Africa" "Africa" "Africa" "Africa" ...
##  $ subregion : chr  "Eastern Africa" "Eastern Africa" "Eastern Africa" "Eastern Africa" ...
##  $ region_wb : chr  "Sub-Saharan Africa" "Sub-Saharan Africa" "Sub-Saharan Africa" "Sub-Saharan Africa" ...
##  $ name_len  : int  8 8 8 8 8 6 6 6 6 6 ...
##  $ long_len  : int  8 8 8 8 8 6 6 6 6 6 ...
##  $ abbrev_len: int  5 5 5 5 5 6 6 6 6 6 ...
##   [list output truncated]
# Ensure no duplicate country-year pairs
duplicates <- plot_data %>%
  group_by(iso_a3, Year) %>%
  filter(n() > 1)
print(duplicates)  # Ideally empty; any rows listed here share both iso_a3 and Year
## # A tibble: 6 × 171
## # Groups:   iso_a3, Year [1]
##   featurecla    scalerank labelrank sovereignt sov_a3 adm0_dif level type  tlc  
##   <chr>             <int>     <int> <chr>      <chr>     <int> <int> <chr> <chr>
## 1 Admin-0 coun…         1         5 Somaliland SOL           0     2 Sove… 1    
## 2 Admin-0 coun…         1         6 Kosovo     KOS           0     2 Disp… 1    
## 3 Admin-0 coun…         1         6 Northern … CYN           0     2 Sove… 1    
## 4 Admin-0 coun…         5         5 Australia  AU1           1     2 Depe… 1    
## 5 Admin-0 coun…         5         5 Australia  AU1           1     2 Depe… 1    
## 6 Admin-0 coun…         1         5 Kashmir    KAS           0     2 Inde… <NA> 
## # ℹ 162 more variables: admin <chr>, adm0_a3 <chr>, geou_dif <int>,
## #   geounit <chr>, gu_a3 <chr>, su_dif <int>, subunit <chr>, su_a3 <chr>,
## #   brk_diff <int>, name <chr>, name_long <chr>, brk_a3 <chr>, brk_name <chr>,
## #   brk_group <chr>, abbrev <chr>, postal <chr>, formal_en <chr>,
## #   formal_fr <chr>, name_ciawf <chr>, note_adm0 <chr>, note_brk <chr>,
## #   name_sort <chr>, name_alt <chr>, mapcolor7 <int>, mapcolor8 <int>,
## #   mapcolor9 <int>, mapcolor13 <int>, pop_est <dbl>, pop_rank <int>, …
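
The duplicate check above is not empty: it flags six rows, including Somaliland and Kosovo, that fall into a single group. These appear to be the six rows with the code "-99", which Natural Earth uses as a placeholder where no ISO code is assigned. The sketch below shows one possible way to handle them, assuming it is acceptable to drop those rows before plotting. The toy data frame is hypothetical, and it assumes that `Year` is missing for these entries.

```r
library(dplyr)

# Toy data: two valid rows plus three placeholder-coded rows (hypothetical)
toy <- data.frame(
  iso_a3 = c("ZWE", "ZWE", "-99", "-99", "-99"),
  admin  = c("Zimbabwe", "Zimbabwe", "Somaliland", "Kosovo", "Northern Cyprus"),
  Year   = c(2000, 2005, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Same duplicate check as above: rows sharing iso_a3 and Year
dup_toy <- toy %>%
  group_by(iso_a3, Year) %>%
  filter(n() > 1) %>%
  ungroup()

# Drop the placeholder code, then repeat the check
clean_toy <- toy %>%
  filter(iso_a3 != "-99")

dup_after <- clean_toy %>%
  group_by(iso_a3, Year) %>%
  filter(n() > 1) %>%
  ungroup()

print(dup_toy)    # the three "-99" rows
print(dup_after)  # no rows left
```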
# Check for missing prevalence values
missing_prevalence <- plot_data %>%
  filter(is.na(Prevalence))
print(missing_prevalence)  # Check if some years have missing data
##   [1] featurecla scalerank  labelrank  sovereignt sov_a3     adm0_dif  
##   [7] level      type       tlc        admin      adm0_a3    geou_dif  
##  [13] geounit    gu_a3      su_dif     subunit    su_a3      brk_diff  
##  [19] name       name_long  brk_a3     brk_name   brk_group  abbrev    
##  [25] postal     formal_en  formal_fr  name_ciawf note_adm0  note_brk  
##  [31] name_sort  name_alt   mapcolor7  mapcolor8  mapcolor9  mapcolor13
##  [37] pop_est    pop_rank   pop_year   gdp_md     gdp_year   economy   
##  [43] income_grp fips_10    iso_a2     iso_a2_eh  iso_a3     iso_a3_eh 
##  [49] iso_n3     iso_n3_eh  un_a3      wb_a2      wb_a3      woe_id    
##  [55] woe_id_eh  woe_note   adm0_iso   adm0_diff  adm0_tlc   adm0_a3_us
##  [61] adm0_a3_fr adm0_a3_ru adm0_a3_es adm0_a3_cn adm0_a3_tw adm0_a3_in
##  [67] adm0_a3_np adm0_a3_pk adm0_a3_de adm0_a3_gb adm0_a3_br adm0_a3_il
##  [73] adm0_a3_ps adm0_a3_sa adm0_a3_eg adm0_a3_ma adm0_a3_pt adm0_a3_ar
##  [79] adm0_a3_jp adm0_a3_ko adm0_a3_vn adm0_a3_tr adm0_a3_id adm0_a3_pl
##  [85] adm0_a3_gr adm0_a3_it adm0_a3_nl adm0_a3_se adm0_a3_bd adm0_a3_ua
##  [91] adm0_a3_un adm0_a3_wb continent  region_un  subregion  region_wb 
##  [97] name_len   long_len   abbrev_len tiny       homepart   min_zoom  
## [103] min_label  max_label  label_x    label_y    ne_id      wikidataid
## [109] name_ar    name_bn    name_de    name_en    name_es    name_fa   
## [115] name_fr    name_el    name_he    name_hi    name_hu    name_id   
## [121] name_it    name_ja    name_ko    name_nl    name_pl    name_pt   
## [127] name_ru    name_sv    name_tr    name_uk    name_ur    name_vi   
## [133] name_zh    name_zht   fclass_iso tlc_diff   fclass_tlc fclass_us 
## [139] fclass_fr  fclass_ru  fclass_es  fclass_cn  fclass_tw  fclass_in 
## [145] fclass_np  fclass_pk  fclass_de  fclass_gb  fclass_br  fclass_il 
## [151] fclass_ps  fclass_sa  fclass_eg  fclass_ma  fclass_pt  fclass_ar 
## [157] fclass_jp  fclass_ko  fclass_vn  fclass_tr  fclass_id  fclass_pl 
## [163] fclass_gr  fclass_it  fclass_nl  fclass_se  fclass_bd  fclass_ua 
## [169] Entity     Year       Prevalence
## <0 rows> (or 0-length row.names)
# Check the distribution of years and countries
table(plot_data$Year)
## 
## 2000 2005 2010 2015 2020 
##  163  163  163  163  163
table(plot_data$iso_a3)
## 
## -99 ABW AFG AGO AIA ALA ALB AND ARE ARG ARM ASM ATA ATF ATG AUS AUT AZE BDI BEL 
##   6   1   5   1   1   1   5   5   1   5   5   1   1   1   1   5   5   5   5   5 
## BEN BFA BGD BGR BHR BHS BIH BLM BLR BLZ BMU BOL BRA BRB BRN BTN BWA CAF CAN CHE 
##   5   5   5   5   5   5   5   1   5   5   1   5   5   5   5   1   5   1   5   5 
## CHL CHN CIV CMR COD COG COK COL COM CPV CRI CUB CUW CYM CYP CZE DEU DJI DMA DNK 
##   5   5   5   5   5   5   1   5   5   5   5   5   1   1   5   5   5   1   1   5 
## DOM DZA ECU EGY ERI ESH ESP EST ETH FIN FJI FLK FRA FRO FSM GAB GBR GEO GGY GHA 
##   5   5   5   5   5   1   5   5   5   5   5   1   5   1   1   1   5   5   1   5 
## GIN GMB GNB GNQ GRC GRD GRL GTM GUM GUY HKG HMD HND HRV HTI HUN IDN IMN IND IOT 
##   1   5   5   1   5   1   1   5   1   5   1   1   1   5   5   5   5   1   5   1 
## IRL IRN IRQ ISL ISR ITA JAM JEY JOR JPN KAZ KEN KGZ KHM KIR KNA KOR KWT LAO LBN 
##   5   5   5   5   5   5   5   1   5   5   5   5   5   5   5   1   5   5   5   5 
## LBR LBY LCA LIE LKA LSO LTU LUX LVA MAC MAF MAR MCO MDA MDG MDV MEX MHL MKD MLI 
##   5   1   1   1   5   5   5   5   5   1   1   5   1   5   5   5   5   5   1   5 
## MLT MMR MNE MNG MNP MOZ MRT MSR MUS MWI MYS NAM NCL NER NFK NGA NIC NIU NLD NOR 
##   5   5   5   5   1   5   5   1   5   5   5   5   1   5   1   5   1   1   5   5 
## NPL NRU NZL OMN PAK PAN PCN PER PHL PLW PNG POL PRI PRK PRT PRY PSE PYF QAT ROU 
##   5   5   5   5   5   5   1   5   5   5   5   5   1   5   5   5   1   1   5   5 
## RUS RWA SAU SDN SEN SGP SGS SHN SLB SLE SLV SMR SOM SPM SRB SSD STP SUR SVK SVN 
##   5   5   5   1   5   5   1   1   5   5   5   1   1   1   5   1   5   1   5   5 
## SWE SWZ SXM SYC SYR TCA TCD TGO THA TJK TKM TLS TON TTO TUN TUR TUV TWN TZA UGA 
##   5   5   1   5   1   1   5   5   5   1   5   5   5   1   5   5   5   1   5   5 
## UKR URY USA UZB VAT VCT VEN VGB VIR VNM VUT WLF WSM YEM ZAF ZMB ZWE 
##   5   5   5   5   1   1   1   1   1   5   5   1   5   5   5   5   5
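
In the table above, most codes appear five times (once per survey year), while others appear only once. A short sketch of how codes with incomplete year coverage could be listed programmatically is given below. It assumes that a code with a single row has no recorded `Year`; the data frame `toy` is hypothetical.

```r
library(dplyr)

# Toy data: one code with all five years, one code with no year at all
toy <- data.frame(
  iso_a3 = c(rep("ZWE", 5), "VEN"),
  Year   = c(2000, 2005, 2010, 2015, 2020, NA),
  stringsAsFactors = FALSE
)

# Count the non-missing years per code and keep codes with fewer than five
incomplete <- toy %>%
  group_by(iso_a3) %>%
  summarise(n_years = sum(!is.na(Year)), .groups = "drop") %>%
  filter(n_years < 5)

print(incomplete)
```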

Description of the cleaned data

Entity is the name of the country. Code is the OWID internal entity code, which is used when the entity is a country or region; it is the key matched to iso_a3 in the map data. Year is the year of the prevalence estimate. Prevalence.of.current.tobacco.use….of.adults. is the share of adults (%) who currently use tobacco; it appears as Prevalence in the outputs above.
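
As a rough illustration of how a long imported column name can be shortened to `Prevalence`, the sketch below uses `rename_with()` from dplyr. The column name `Prevalence.long.name` and the data frame `raw_toy` are hypothetical placeholders, not the names in the raw file, and this is not necessarily how the renaming was done in the project scripts.

```r
library(dplyr)

# Toy stand-in for the raw data (hypothetical names and values)
raw_toy <- data.frame(
  Entity = c("A", "B"),
  Code   = c("AAA", "BBB"),
  Year   = c(2000, 2000),
  Prevalence.long.name = c(10, 20),
  stringsAsFactors = FALSE
)

# Rename the single column whose name starts with "Prevalence"
clean_toy <- raw_toy %>%
  rename_with(~ "Prevalence", starts_with("Prevalence"))

names(clean_toy)
```

Selecting by prefix avoids typing the full generated name, but it only works as intended when exactly one column matches.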

Initial visualization

# Split the data by year into two subsets, then stack them back together
subset_data <- plot_data %>% filter(Year %in% c(2000, 2005))
subset_data2 <- plot_data %>% filter(Year %in% c(2010, 2015, 2020))

combined_data <- bind_rows(subset_data, subset_data2)


# Plot using combined data
plot <- plot_ly(
  data = combined_data,
  type = "choropleth",
  locations = ~iso_a3,
  locationmode = "ISO-3",
  z = ~Prevalence,
  frame = ~Year,
  text = ~paste("Country:", Entity, "<br>Prevalence:", Prevalence, "%"),
  colorscale = "Reds",
  zmin = 0,
  zmax = 68.5,
  showscale = TRUE
) %>%
  layout(
    title = "Global Smoking Prevalence (Subset Test)",
    geo = list(
      projection = list(type = "mercator"),
      showcoastlines = TRUE,
      coastlinecolor = "grey"
    )
  )
plot
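
The plot hard-codes `zmax = 68.5`, which matches the maximum prevalence in the summary output. A small sketch of deriving the colour limits from the data instead is shown below, so that they stay correct if the dataset is updated. The data frame `toy` and the name `z_limits` are illustrative assumptions.

```r
# Toy data with one missing value (hypothetical)
toy <- data.frame(
  Year       = c(2000, 2005, 2010),
  Prevalence = c(12.5, NA, 68.5)
)

# Lower limit fixed at 0; upper limit taken from the data, ignoring NA
z_limits <- c(0, max(toy$Prevalence, na.rm = TRUE))

print(z_limits)
```

The two values could then be passed as `zmin = z_limits[1]` and `zmax = z_limits[2]` in `plot_ly()`. Computing them once over all years, rather than per frame, keeps the colours comparable across the animation.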

Final visualization

plot <- plot_ly(
  data = combined_data,
  type = "choropleth",
  locations = ~iso_a3,
  locationmode = "ISO-3",
  z = ~Prevalence,
  frame = ~Year,
  text = ~paste(
    "Country:", Entity, 
    ifelse(is.na(Prevalence), "<br>No Data", paste0("<br>Prevalence: ", Prevalence, "%"))
  ),
  colorscale = list(
    c(0, "#ffeda0"),   # Low prevalence: pale yellow
    c(0.5, "#feb24c"), # Medium prevalence: orange
    c(1, "#67000d")    # High prevalence: dark red
  ),
  zmin = 0,
  zmax = 68.5,
  showscale = TRUE,
  marker = list(line = list(color = "grey", width = 0.5))  # Border for the countries
) %>%
  layout(
    title = "Global Smoking Prevalence (Subset Test)",
    geo = list(
      projection = list(type = "equirectangular"),  # Rectangular projection
      showcoastlines = TRUE,                # Show coastlines
      coastlinecolor = "grey",              # Set coastline border color
      showcountries = TRUE,                 # Ensure country borders are shown
      countrycolor = "grey",                # Set country border color
      showland = TRUE,                      # Show land explicitly
      landcolor = "white",                  # Set land colour to white
      showocean = TRUE,                     # Enable ocean rendering
      oceancolor = "lightblue",             # Set ocean colour to light blue
      showframe = FALSE                     # Optionally remove frame border
    ),
    annotations = list(
      list(
        x = 0.5,                            # Horizontal position: centred on the plot
        y = -0.1,                           # Vertical position: just below the plot area
        xref = "paper",                     # Interpret x as a fraction of the plot width
        yref = "paper",                     # Interpret y as a fraction of the plot height
        text = "Note: White regions indicate missing data.", # Note text
        showarrow = FALSE,                  # Disable arrow pointing
        font = list(size = 12, color = "black"), # Font size and color
        align = "left"
      )
    )
  )

# Display the plot
plot
